Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CBP(),DBP() to Marvell VSP #233

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

alkama-hasan
Copy link
Contributor

This Commits adds CreateBridgePort() and DeleteBridgePort() API Support for Marvell VSP

@openshift-ci openshift-ci bot requested review from bn222 and vrindle December 19, 2024 13:18
Copy link

openshift-ci bot commented Dec 19, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: alkama-hasan
Once this PR has been reviewed and has the lgtm label, please assign saldaniele for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 19, 2024
Copy link

openshift-ci bot commented Dec 19, 2024

Hi @alkama-hasan. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@SalDaniele
Copy link
Contributor

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 19, 2024
@SalDaniele
Copy link
Contributor

/test make-e2e-test

@@ -43,6 +43,10 @@ spec:
mountPropagation: Bidirectional
- name: proc
mountPath: /proc
- name: ovs-bin
mountPath: /usr/bin/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems not good to mount over /usr/bin.

What is in /usr/local/bin? Can you mount over that? That path should also be in $PATH.

@@ -72,6 +90,9 @@ func (vsp *mrvlVspServer) createVethPair(index int) error {
if err := netlink.LinkAdd(vethLink); err != nil {
return err
}
if err := netlink.LinkSetUp(vethLink); err != nil {
return err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brings code back that was discussed here #162 (comment) . Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No not necessary will update it

@@ -100,7 +121,7 @@ func (vsp *mrvlVspServer) createVethPair(index int) error {

func (vsp *mrvlVspServer) createHwLBK() error {
//TODO: Implement HW Loopback
klog.Infof("currently only veth pairs are supported")
vsp.log.Info("Currently only veth pairs are supported")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change (and similar below) could be split to a separate patch, which by itself would be trivial and easy to review. Making the review of the non-trivial part easier.

@@ -174,7 +195,7 @@ func (vsp *mrvlVspServer) GetDeviceHealth(nfInterfaceName string) string {
}
return "Healthy"
case "hwlbk":
return "Healthy" //TODO: Implement HW Loopback
return "Unhealthy" //TODO: Implement HW Loopback
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change? This could be a separate commit, then the commit message could explain why this is done.

return &pb.IpPort{}, err
}
// Initialize Marvell Data Path
vsp.bridgeName = "br0" // TODO: example name discuss on it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"br0" doesn't seem good to me. Maybe br-mrv0.

In any case, I think there should be a string constant, as the string "br0" is used multiple times in this commit.

if vfcnt < 0 {
return nil, errors.New("invalid VF Count")
}
cmd := exec.Command("sh", "-c", fmt.Sprintf("echo %d > /sys/bus/pci/devices/0000:17:00.0/sriov_numvfs", vfcnt))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't somebody need to take care that if sriov_numvfs is not zero, that it first gets reset to zero? Who does that or how is this supposed to work?

const (
ovsDBPath = "/var/run/openvswitch/db.sock"
ovsCLIPath = "/usr/local/bin/ovs-vsctl"
deviceID = "a063"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think constants should be CamelCase. Also because there is already ovsDBPath (in this spelling) used for something else.

}
return &pb.IpPort{
Ip: ipPort.Ip,
Port: ipPort.Port,
}, err
}

// getVFName function to get the VF Name of the given BridgePortName on DPU
func (vsp *mrvlVspServer) getVFDetails(BridgePortName string) (string, string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// InitDataPlane initializes the data path in this case it creates an ovs bridge
func (ovsdp *OvsDP) InitDataPlane(bridgeName string) error {
ovsdp.log.Info("Initializing OVS-DPDK Data Plane")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When taking with Sushil, he mentioned that only the non-performant netdev path has been upstreamed. Will we still use ovs-dpdk?

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
@alkama-hasan alkama-hasan force-pushed the mrvl-cbp-dbp branch 2 times, most recently from 1467616 to 6b5a2a2 Compare January 9, 2025 13:00
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
This Commits adds CreateBridgePort() and DeleteBridgePort() API Support
for Marvell VSP

Signed-off-by: Alkama Hasan <[email protected]>
Copy link

openshift-ci bot commented Jan 10, 2025

@alkama-hasan: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/make-e2e-test b7b391c link true /test make-e2e-test

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

_, err = resetCmd.CombinedOutput()
if err != nil {
return nil, fmt.Errorf("failed to reset sriov_numvfs to 0: %v", err)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better to me. Obviously, it needs to be tested whether it works well.

Resetting to zero is quite disruptive. Maybe the code should first read the file, and do nothing if it already contains the desired number.

}
return ifname, nil
}

// enableIPV6LinkLocal function to enable the IPv6 Link Local Address on the given Interface Name
// It will return the error
func enableIPV6LinkLocal(interfaceName string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly related to this PR. However, enableIPV6LinkLocal() creates NetworkManager profiles. I don't think that is a good approach.

For one, that's only going to work with systems that have NetworkManager running. While that is the case for RHEL, optimally dpu-operator does not depend on that.

It's also slightly cumbersome to do (you would have to either bind-mount the D-Bus socket in the pod, or as you use nsenter -t 1 -m -u -n -i -- nmcli ... to be in the mount namespace of PID1. Additionally, you need suitable polkit permissions (or run as user root). That latter part is missing, so that we get :

# oc exec -ti pod/vsp-97tbs -- nsenter -t 1 -m -u -n -i -- nmcli connection add type ethernet
Error: Failed to add 'ethernet' connection: Insufficient privileges
command terminated with exit code 4

NetworkManager is IMO great, but not the right solution in this case because NetworkManager is about persistent configuration, when here we just want ad-hoc configure IP addresses.

I think it just needs something like:

ip link set $LINK addrgenmode eui64
ip link set $LINK down
ip link set $LINK up

or maybe use ip token

Let me try to send a patch for what I said here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thom311 Is this why your VSP was failing on our OCP cluster?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Follow up at #262

}
var pciAddress string
for _, device := range pci.Devices {
if device.Vendor != nil && device.Product != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on the DPU, I get:

$ lspci -nn | grep a0f7
0002:01:00.1 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.2 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.3 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.4 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.5 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.6 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:00.7 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:01.0 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)
0002:01:01.1 Ethernet controller [0200]: Cavium, Inc. Octeon Tx2 SDP Virtual Function [177d:a0f7] (rev 54)

so there are several PCI devices. How does the code know which one to take? I presume, it wants the PF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it wants pf, the code takes first device of the array which is vf0 of pf0 technically first pf
so vf0 is a pf

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you rely more on the PCI ID? Rather than the array indexes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If your hardware team says this is ok, then we will just use the array index. But please do your best to see if we can use sysfs. This is potentially fragile to kernel code changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it also depends on that the golang library ghw.PCI().Devices returns the interfaces in a well defined order. That may not necessarily be the same as lspci lists them or numerically sorted by they PCI address (though, it may well be).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thom311 has a valid remark. I recommend not making assumptions. Soon, we will have the ability to test without real hardware, and the order of devices returned by this function might not be the same.

@@ -0,0 +1,41 @@
package DebugDP
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment to this file what this is used for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of adding other Data plane for marvell vsp in near future
as of now this can be used to debug all the API call to data plane as we are planning to packages ovs-dpdk data plane for now as a seperate pod.
currently this functionality(debug dp) can be helpful for debugging mrvl-vsp without data plane it is only dummping the info called from vsp.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds useful.

But could you plase add what you just said as a comment? Your comment on github here cannot be found tomorrow, when somebody reviews the code.

It is also not clear to me how to use this in practice. Could you elaborate and give a short usage example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thom311 sure i will add in comment.
Regarding usage.
we have a flag in vsp(right now just a const inside main.go future plan isto make it configurable during run time) to choose dataplane where one can opt "debug" to use this funcitonality and this will dump all the everything to dataplane from vsp.
can be used for debug purpose of vsp without having a data plane also

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test Indicates a non-member PR verified by an org member that is safe to test.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants